(57) Abstract: Methods and system for automated inference of physico-chemical interaction knowledge from databases of term 
co-occurrence data. The co-occurrence data includes co-occurrences between chemical or biological molecules or co-occurrences 
between chemical or biological molecules and biological processes. Likelihood statistics are determined and applied to decide if 
co-occurrence data reflecting physico-chemical interactions is non-trivial. A next node or an unknown target representing chemical 
or biological molecules in a biological pathway is selected based on co-occurrence values. The method and system may be used to 
further facilitate a user's understanding of biological functions, such as cell functions, to design experiments more intelligently and 
to analyze experimental results more thoroughly. Specifically, the present invention may help drug discovery scientists select better 
targets for pharmaceutical intervention in the hope of curing diseases. The method and system may also help facilitate the abstraction 
of knowledge from information for biological experimental data and provide new bioinformatic techniques. 
RELATED-ACC-NO: 2001-496878 
ABSTRACTED-PUB-NO: US20020002559A 
BASIC-ABSTRACT: 

NOVELTY - The strength of co-occurrence data is measured by extracting at least 
two chemical or biological molecule names from a database record, determining a 
likelihood statistic for the co-occurrence reflecting physico-chemical 
interactions between the two molecule names, and applying the statistic to the 
co-occurrence to determine whether the co-occurrence between the molecule names 
is non-trivial. 

DETAILED DESCRIPTION - Strength measurement of co-occurrence data involves 
extracting at least two chemical or biological molecule names from a database 
record in an inference database; determining a likelihood statistic for the 
co-occurrence reflecting physico-chemical interactions between the two molecule 
names (A and B); and applying the likelihood statistic to the co-occurrence to 
determine whether the co-occurrence between molecule A and molecule B is 
non-trivial. The inference database includes records created from an indexed 
literature database. The two molecule names co-occur in at least one record in 
an indexed scientific literature database. 
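
The record does not say which likelihood statistic is used. As a minimal sketch
only, the test for a non-trivial co-occurrence could be written with the
log-likelihood ratio (G-squared) on a 2x2 table of record counts. The choice of
statistic, the function names, the count parameters, and the threshold of 10.83
(the chi-square critical value for one degree of freedom at p = 0.001) are all
assumptions made for illustration, not details taken from the patent.

```python
import math


def log_likelihood_ratio(n_ab, n_a, n_b, n_total):
    """G-squared statistic for records that mention A and/or B.

    n_ab: records mentioning both names, n_a / n_b: records mentioning
    each name, n_total: all records (hypothetical inputs).
    """
    # Observed counts: both names, A only, B only, neither.
    observed = [
        n_ab,
        n_a - n_ab,
        n_b - n_ab,
        n_total - n_a - n_b + n_ab,
    ]
    # Counts expected if A and B were mentioned independently.
    expected = [
        n_a * n_b / n_total,
        n_a * (n_total - n_b) / n_total,
        (n_total - n_a) * n_b / n_total,
        (n_total - n_a) * (n_total - n_b) / n_total,
    ]
    # Empty cells contribute nothing to the sum.
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


def is_non_trivial(n_ab, n_a, n_b, n_total, threshold=10.83):
    """True if A and B co-occur more often than chance at the threshold."""
    expected_ab = n_a * n_b / n_total
    score = log_likelihood_ratio(n_ab, n_a, n_b, n_total)
    # Require an excess of co-occurrences, not a deficit.
    return n_ab > expected_ab and score > threshold
```

The extra check on the expected count matters because the statistic is also
large when two names co-occur far less often than chance would predict, which
is not evidence of an interaction.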

INDEPENDENT CLAIMS are also included for: 

(1) a method of contextual querying of co-occurrence data comprising selecting 
a target node from a first list of nodes connected by arcs in a connection 
network; creating a second list of nodes by considering other nodes that are 
neighbors of the target node and other nodes prior to the target node in the 
connection network; and selecting a next node from the second list of nodes 
using co-occurrence values, in which the next node is next after the target 
node in a pre-determined order for the connection network based on the 
co-occurrence values; 

(2) a method of query polling of co-occurrence data comprising selecting a 
position in a connection network for an unknown target node from a first list 
of nodes; determining a second list of nodes prior to the position of the 
unknown target node in the connection network; determining a third list of 
nodes subsequent to the position of the unknown target node in the connection 
network; determining a fourth list of nodes included in both the second and 
the third lists of nodes; and determining an identity for the unknown target 
node by selecting a node from the fourth list of nodes using a likelihood 
statistic; and 

(3) a method for creating automated biological inferences comprising 
constructing a connection network using at least one database record from an 
inference database; applying likelihood statistic analysis methods to the 
connection network; and automatically generating at least one biological 
inference relationship between chemical or biological molecules or biological 
processes using the results from the likelihood statistic analysis methods. 
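
The query polling method in claim (2) can be illustrated with a small sketch.
It rests on one reading of the claim, which is an assumption: the second and
third lists are taken to be the terms that co-occur with the known nodes just
before and just after the unknown position, and the fourth list is their
intersection. The data layout (a nested dictionary of scores), the function
name, and the rule of ranking candidates by their weaker score are hypothetical.

```python
def poll_unknown_target(cooccurrence, upstream, downstream):
    """Pick a candidate for an unknown node between two known nodes.

    cooccurrence maps a node name to {co-occurring name: likelihood score}.
    Returns the best candidate name, or None if no name is shared.
    """
    known = {upstream, downstream}
    # Second list: names co-occurring with the node before the gap.
    second = set(cooccurrence.get(upstream, {})) - known
    # Third list: names co-occurring with the node after the gap.
    third = set(cooccurrence.get(downstream, {})) - known
    # Fourth list: names present in both lists.
    fourth = second & third
    if not fourth:
        return None

    def weaker_score(name):
        # A candidate is only as well supported as its weaker link.
        return min(cooccurrence[upstream][name], cooccurrence[downstream][name])

    # Sorting first makes ties resolve alphabetically.
    return max(sorted(fourth), key=weaker_score)


# Toy scores with placeholder names, invented for illustration only.
scores = {
    "A": {"X": 12.0, "Y": 30.0, "Z": 5.0},
    "B": {"X": 20.0, "Y": 3.0, "W": 9.0},
}
candidate = poll_unknown_target(scores, "A", "B")
```

Here "X" and "Y" both appear in the two lists, and "X" is selected because its
weaker score (12.0) exceeds that of "Y" (3.0).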

USE - The method is for automated inference of physico-chemical interaction 
knowledge from databases of term co-occurrence data. It can also be used to 
facilitate a user's understanding of biological functions, e.g. cell functions, 
to design experiments, and to analyze experimental results. 

ADVANTAGE - The method helps drug discovery scientists select better targets 
for pharmaceutical intervention in the hope of curing diseases. It may also 
help facilitate the abstraction of knowledge from information for biological 
experimental data and provides new bioinformatic techniques. 

ABSTRACTED-PUB-NO: US20020004792A 

EQUIVALENT-ABSTRACTS: 

NOVELTY - The strength of co-occurrence data is measured by extracting at least 
two chemical or biological molecule names from a database record, determining a 
likelihood statistic for the co-occurrence reflecting physico-chemical 
interactions between the two molecule names, and applying the statistic to the 
co-occurrence to determine whether the co-occurrence between the molecule names 
is non-trivial. 

DETAILED DESCRIPTION - Strength measurement of co-occurrence data involves 
extracting at least two chemical or biological molecule names from a database 
record in an inference database; determining a likelihood statistic for the 
co-occurrence reflecting physico-chemical interactions between the two molecule 
names (A and B); and applying the likelihood statistic to the co-occurrence to 
determine whether the co-occurrence between molecule A and molecule B is 
non-trivial. The inference database includes records created from an indexed 
literature database. The two molecule names co-occur in at least one record in 
an indexed scientific literature database. 

INDEPENDENT CLAIMS are also included for: 

(1) a method of contextual querying of co-occurrence data comprising selecting 
a target node from a first list of nodes connected by arcs in a connection 
network; creating a second list of nodes by considering other nodes that are 
neighbors of the target node and other nodes prior to the target node in the 
connection network; and selecting a next node from the second list of nodes 
using co-occurrence values, in which the next node is next after the target 
node in a pre-determined order for the connection network based on the 
co-occurrence values; 

(2) a method of query polling of co-occurrence data comprising selecting a 
position in a connection network for an unknown target node from a first list 
of nodes; determining a second list of nodes prior to the position of the 
unknown target node in the connection network; determining a third list of 
nodes subsequent to the position of the unknown target node in the connection 
network; determining a fourth list of nodes included in both the second and 
the third lists of nodes; and determining an identity for the unknown target 
node by selecting a node from the fourth list of nodes using a likelihood 
statistic; and 

(3) a method for creating automated biological inferences comprising 
constructing a connection network using at least one database record from an 
inference database; applying likelihood statistic analysis methods to the 
connection network; and automatically generating at least one biological 
inference relationship between chemical or biological molecules or biological 
processes using the results from the likelihood statistic analysis methods. 

USE - The method is for automated inference of physico-chemical interaction 
knowledge from databases of term co-occurrence data. It can also be used to 
facilitate a user's understanding of biological functions, e.g. cell functions, 
to design experiments, and to analyze experimental results. 

ADVANTAGE - The method helps drug discovery scientists select better targets 
for pharmaceutical intervention in the hope of curing diseases. It may also 
help facilitate the abstraction of knowledge from information for biological 
experimental data and provides new bioinformatic techniques. 
